24,985 research outputs found

    Mining Shared Decision Trees between Datasets

    This thesis studies the problem of mining models, patterns and structures (MPS) shared by two datasets (applications): a well understood dataset, denoted WD, and a poorly understood one, denoted PD. Combined with users' familiarity with WD, the shared MPS can help users better understand PD, since they capture similarities between WD and PD. Moreover, knowledge of such similarities lets users focus their attention on analyzing the unique behavior of PD. Technically, this thesis focuses on the shared decision tree mining problem. To provide a view of the similarities between WD and PD, this thesis proposes to mine a high-quality shared decision tree that satisfies two properties: the tree has (1) highly similar data distributions and (2) high classification accuracy in both datasets. This thesis proposes an algorithm, SDT-Miner, for mining such a shared decision tree. The algorithm differs significantly from traditional decision tree mining, since it addresses the challenges caused by the presence of two datasets, by the data distribution similarity requirement, and by the tree accuracy requirement. The effectiveness of the algorithm is verified by experiments.
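The two quality properties named in the abstract can be made concrete with a small sketch. The abstract does not say how SDT-Miner measures either property, so everything here is an assumption for illustration: a tree is modeled as a function returning a `(leaf_id, predicted_label)` pair, distribution similarity is taken to be the cosine similarity of the leaf-frequency vectors of WD and PD, and the names `shared_tree_quality`, `leaf_distribution`, and `toy_tree` are hypothetical.

```python
from collections import Counter
from math import sqrt


def leaf_distribution(tree, records, leaves):
    """Fraction of records that reach each leaf, in the order given by `leaves`."""
    counts = Counter(tree(x)[0] for x, _ in records)
    return [counts[leaf] / len(records) for leaf in leaves]


def accuracy(tree, records):
    """Fraction of records whose predicted label matches the true label."""
    return sum(tree(x)[1] == y for x, y in records) / len(records)


def cosine(p, q):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = sqrt(sum(a * a for a in p)) * sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def shared_tree_quality(tree, leaves, wd, pd):
    """Score one candidate tree on both datasets (assumed measures, not SDT-Miner's)."""
    return {
        "distribution_similarity": cosine(
            leaf_distribution(tree, wd, leaves),
            leaf_distribution(tree, pd, leaves),
        ),
        "accuracy_wd": accuracy(tree, wd),
        "accuracy_pd": accuracy(tree, pd),
    }


def toy_tree(x):
    """Hypothetical one-split tree: a single threshold test on attribute 'a'."""
    return ("left", "neg") if x["a"] < 5 else ("right", "pos")
```

A candidate tree would count as a good shared tree under these assumed measures when all three values are high. A mining procedure would additionally have to search over trees, which this sketch does not attempt.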

    Criticality and Continuity of Explosive Site Percolation in Random Networks

    This Letter studies the critical point and the discontinuity of a class of explosive site percolation in Erdős-Rényi (ER) random networks. The class of percolation is implemented by introducing a best-of-m rule. Two major results are found: (i) For any specific $m$, the critical percolation point scales with the average degree of the network, while its exponent associated with $m$ is bounded by $-1$ and $\sim -0.5$. (ii) Discontinuous percolation can occur on sparse networks if and only if $m$ approaches infinity. These results not only generalize some conclusions of ordinary percolation but also provide new insights into network robustness. Comment: 5 pages, 5 figures.
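A best-of-m site percolation process on an ER network can be simulated with a short union-find sketch. The abstract does not define the rule, so the version below is an assumption: at each step, m vacant sites are drawn uniformly at random and the one whose occupation would create the smallest cluster is occupied. The function names (`er_graph`, `best_of_m_percolation`) are hypothetical, and with `m = 1` the process reduces to ordinary random site percolation.

```python
import random


def er_graph(n, avg_degree, rng):
    """Adjacency lists of a G(n, p) graph with p = avg_degree / (n - 1)."""
    p = avg_degree / (n - 1)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def merged_size(v, adj, occupied, parent, size):
    """Size of the cluster that would form if vacant site v were occupied."""
    roots = {find(parent, u) for u in adj[v] if occupied[u]}
    return 1 + sum(size[r] for r in roots)


def best_of_m_percolation(adj, m, rng):
    """Occupy all sites one by one; return the largest-cluster fraction per step."""
    n = len(adj)
    parent, size = list(range(n)), [1] * n
    occupied, vacant = [False] * n, list(range(n))
    largest, curve = 0, []
    while vacant:
        # Draw up to m candidate vacant sites and keep the least "explosive" one.
        picks = rng.sample(range(len(vacant)), min(m, len(vacant)))
        best = min(
            picks,
            key=lambda i: merged_size(vacant[i], adj, occupied, parent, size),
        )
        v = vacant[best]
        vacant[best] = vacant[-1]  # swap-remove the chosen site
        vacant.pop()
        occupied[v] = True
        for u in adj[v]:
            if occupied[u]:
                ru, rv = find(parent, u), find(parent, v)
                if ru != rv:
                    if size[ru] < size[rv]:
                        ru, rv = rv, ru
                    parent[rv] = ru
                    size[ru] += size[rv]
        largest = max(largest, size[find(parent, v)])
        curve.append(largest / n)
    return curve
```

Calling `best_of_m_percolation(er_graph(2000, 4.0, random.Random(0)), m, random.Random(1))` for several values of `m` and plotting the returned curves against the occupied fraction gives a rough picture of how the onset of the giant cluster shifts with `m`. Estimating the critical point or the scaling exponents reported in the Letter would need much larger networks and averaging over many realizations.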